[TRT/TRT RTX EP] Fix bug for missing outputs in the returning ComputeCapability/IndexedSubGraph #26444
base: main
Conversation
You can commit the suggested changes from lintrunner.
@chilo-ms any more AIs needed for this to be merged? Thanks.
```cpp
graph_outputs_to_add[output] = output_order;
// This output is the graph's output.
// So the output should be put into the subgraph's output list.
graph_outputs_to_add.insert({output, output_order});
```
If the key already exists in any of those maps, i.e. fused_inputs, fused_outputs, fused_outputs_to_add and graph_outputs_to_add, it's not necessary to override it.
input_order/output_order is simply a relative ordering associated with each input/output, so that when constructing the final sub_graph's input/output lists from the maps above, inputs/outputs with a smaller order index appear before those with a larger order index.
So the exact order index doesn't matter; it's sufficient that an output which should appear before another output has a smaller order index.
BTW, I added comments for input_order/output_order to explain how they are used.
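For illustration, here is a minimal, self-contained sketch of that ordering scheme (it uses plain `std::string` keys and made-up output names instead of the EP's `NodeArg*` maps, so it is not the actual `GetSubGraph()` code):

```cpp
#include <algorithm>
#include <iostream>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Outputs are collected into a map keyed by name with a relative order index.
// Only the relative ordering matters: the final subgraph output list is built
// by sorting on that index, so any assignment that keeps "earlier" outputs at
// smaller indices than "later" ones works.
std::vector<std::string> BuildOrderedOutputs(
    const std::unordered_map<std::string, int>& outputs_with_order) {
  std::vector<std::pair<std::string, int>> sorted(outputs_with_order.begin(),
                                                  outputs_with_order.end());
  std::sort(sorted.begin(), sorted.end(),
            [](const auto& a, const auto& b) { return a.second < b.second; });
  std::vector<std::string> result;
  result.reserve(sorted.size());
  for (const auto& entry : sorted) {
    result.push_back(entry.first);
  }
  return result;
}

int main() {
  std::unordered_map<std::string, int> graph_outputs_to_add;
  int output_order = 0;
  // insert() keeps the first index if the key is already present; that is
  // fine because the exact value is irrelevant, only the relative order.
  graph_outputs_to_add.insert({"scores", output_order++});
  graph_outputs_to_add.insert({"labels", output_order++});
  graph_outputs_to_add.insert({"scores", output_order++});  // duplicate: no-op
  for (const auto& name : BuildOrderedOutputs(graph_outputs_to_add)) {
    std::cout << name << "\n";  // prints "scores" then "labels"
  }
  return 0;
}
```

Since only the relative order matters, keeping the first index on a duplicate `insert()` (rather than overwriting it) is sufficient.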
```cpp
if (graph.GetGraph().GetConsumerNodes(output->Name()).size() > 0) {
  fused_outputs[output] = output_order++;
}
for (const auto& output : node->OutputDefs()) {
```
```cpp
 *
 */
{
  Ort::Env env{ORT_LOGGING_LEVEL_WARNING, "test"};
```
fixed
yuslepukhin left a comment:
🕐
onnxruntime/core/providers/nv_tensorrt_rtx/nv_execution_provider.cc (outdated, resolved)
```cpp
 * |--- Mod ---> "labels"
 */
{
  Ort::Env env{ORT_LOGGING_LEVEL_WARNING, "test"};
```
there's already a global Ort::Env instance here - why do we need to define another one? the underlying OrtEnv is a singleton so this would refer to the same instance.
changed to use global ort_env
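For context, a minimal sketch of the suggested pattern, assuming the shared global `ort_env` that the onnxruntime test binary declares (the helper name and body below are hypothetical, not the PR's code):

```cpp
#include <memory>

#include <onnxruntime_cxx_api.h>

// The test binary already owns one global Ort::Env (declared in the shared
// test main used by other onnxruntime unit tests; assumed here). Since the
// underlying OrtEnv is a singleton, constructing another Ort::Env inside each
// test block would just refer to the same instance anyway.
extern std::unique_ptr<Ort::Env> ort_env;

// Hypothetical helper: open a session against the shared environment instead
// of a locally constructed Ort::Env{ORT_LOGGING_LEVEL_WARNING, "test"}.
Ort::Session OpenTestSession(const ORTCHAR_T* model_path) {
  Ort::SessionOptions session_options;
  // ... append the TRT / TRT RTX execution provider to session_options here ...
  return Ort::Session(*ort_env, model_path, session_options);
}
```

Functionally nothing changes, since the sessions were already sharing the one underlying OrtEnv; reusing the global just removes the redundant construction.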
```cpp
  return t;
}

OrtStatus* CreateModelWithNodeOutputNotUsed(const PathString& model_name) {
```
for other tests that use models, we have checked in .onnx files and a script to generate them. is there a good reason to do it differently here? defining the model in a Python script (e.g., with onnxscript) could be more concise.
okay, i added the python scripts as well as the models
onnxruntime/test/testdata/topk_and_multiple_graph_outputs.py (code scanning alert, dismissed)
```
@@ -0,0 +1,43 @@
import onnx
```
Check notice (Code scanning / CodeQL): Module is imported with 'import' and 'import from' (Note, test)
Module 'onnxruntime.test.onnx' is imported with both 'import' and 'import from'.
Copilot Autofix (AI), about 1 hour ago:
To resolve the "Module is imported with both 'import' and 'import from'" issue, remove the `from onnx import TensorProto, helper` statement and reference `TensorProto` and `helper` via the `onnx` module (that is, use `onnx.TensorProto` and `onnx.helper`). Update all usages of `helper` and `TensorProto` in the code accordingly. No additional dependencies or code structure changes are required. Only lines in `onnxruntime/test/testdata/node_output_not_used.py` handling imports and references to `helper` and `TensorProto` need to be changed.
```diff
@@ -1,15 +1,14 @@
 import onnx
-from onnx import TensorProto, helper


 def create_model_with_node_output_not_used(model_path):
     # Create graph
-    X = helper.make_tensor_value_info("X", TensorProto.FLOAT, [3, 2])
-    W = helper.make_tensor_value_info("W", TensorProto.FLOAT, [2, 3])
-    Y = helper.make_tensor_value_info("Y", TensorProto.FLOAT, [2, 3])
+    X = onnx.helper.make_tensor_value_info("X", onnx.TensorProto.FLOAT, [3, 2])
+    W = onnx.helper.make_tensor_value_info("W", onnx.TensorProto.FLOAT, [2, 3])
+    Y = onnx.helper.make_tensor_value_info("Y", onnx.TensorProto.FLOAT, [2, 3])

     # Dropout node (two outputs)
-    dropout_node = helper.make_node(
+    dropout_node = onnx.helper.make_node(
         "Dropout",
         inputs=["X"],
         outputs=["dropout_out", "dropout_mask"],
@@ -17,21 +10,21 @@
     )

     # MatMul node
-    matmul_node = helper.make_node(
+    matmul_node = onnx.helper.make_node(
         "MatMul",
         inputs=["dropout_out", "W"],
         outputs=["Y"],
         name="MatMulNode",
     )

-    graph = helper.make_graph(
+    graph = onnx.helper.make_graph(
         nodes=[dropout_node, matmul_node],
         name="DropoutMatMulGraph",
         inputs=[X, W],
         outputs=[Y],
     )

-    model = helper.make_model(graph, opset_imports=[helper.make_operatorsetid("", 13)])
+    model = onnx.helper.make_model(graph, opset_imports=[onnx.helper.make_operatorsetid("", 13)])

     onnx.checker.check_model(model)
     onnx.save(model, model_path)
```
```
@@ -0,0 +1,78 @@
import onnx
```
Check notice (Code scanning / CodeQL): Module is imported with 'import' and 'import from' (Note, test)
Module 'onnxruntime.test.onnx' is imported with both 'import' and 'import from'.
Copilot Autofix (AI), about 1 hour ago:
To address the issue, remove the `from onnx import TensorProto, helper` import, and instead refer to `TensorProto` and `helper` via the main namespace import: `onnx.TensorProto` and `onnx.helper`. This will make all references to ONNX symbols consistently qualified, improving code clarity. Specifically:
- Remove line 2: `from onnx import TensorProto, helper`.
- Change all references to `TensorProto` in this file to `onnx.TensorProto`.
- Change all references to `helper` to `onnx.helper`.

No other functional changes are required. Only the one file `onnxruntime/test/testdata/topk_and_multiple_graph_outputs.py` needs editing.
```diff
@@ -1,45 +1,44 @@
 import onnx
-from onnx import TensorProto, helper


 def create_model_with_topk_graph_output(model_path):
     # ======================
     # ---- Inputs ----
     # ======================
-    input_tensor = helper.make_tensor_value_info("input", TensorProto.FLOAT, ["N"])
+    input_tensor = onnx.helper.make_tensor_value_info("input", onnx.TensorProto.FLOAT, ["N"])

     # ======================
     # ---- Initializers ----
     # ======================
-    K = helper.make_tensor("K", TensorProto.INT64, dims=[1], vals=[300])
-    zero = helper.make_tensor("zero", TensorProto.INT64, dims=[], vals=[0])
-    twenty_six = helper.make_tensor("twenty_six", TensorProto.INT64, dims=[], vals=[26])
+    K = onnx.helper.make_tensor("K", onnx.TensorProto.INT64, dims=[1], vals=[300])
+    zero = onnx.helper.make_tensor("zero", onnx.TensorProto.INT64, dims=[], vals=[0])
+    twenty_six = onnx.helper.make_tensor("twenty_six", onnx.TensorProto.INT64, dims=[], vals=[26])

     # ======================
     # ---- Nodes ----
     # ======================
-    topk_node = helper.make_node(
+    topk_node = onnx.helper.make_node(
         "TopK",
         inputs=["input", "K"],
         outputs=["scores", "topk_indices"],
         name="TopK",
     )

-    less_node = helper.make_node(
+    less_node = onnx.helper.make_node(
         "Less",
         inputs=["topk_indices", "zero"],
         outputs=["Less_output_0"],
         name="Less",
     )

-    div_node = helper.make_node(
+    div_node = onnx.helper.make_node(
         "Div",
         inputs=["topk_indices", "twenty_six"],
         outputs=["Div_17_output_0"],
         name="Div",
     )

-    mod_node = helper.make_node(
+    mod_node = onnx.helper.make_node(
         "Mod",
         inputs=["topk_indices", "twenty_six"],
         outputs=["labels"],
@@ -49,15 +16,15 @@
     # =========================
     # ---- Graph Outputs ----
     # =========================
-    scores_out = helper.make_tensor_value_info("scores", TensorProto.FLOAT, ["K"])
-    less_out = helper.make_tensor_value_info("Less_output_0", TensorProto.BOOL, ["K"])
-    div_out = helper.make_tensor_value_info("Div_17_output_0", TensorProto.INT64, ["K"])
-    labels_out = helper.make_tensor_value_info("labels", TensorProto.INT64, ["K"])
+    scores_out = onnx.helper.make_tensor_value_info("scores", onnx.TensorProto.FLOAT, ["K"])
+    less_out = onnx.helper.make_tensor_value_info("Less_output_0", onnx.TensorProto.BOOL, ["K"])
+    div_out = onnx.helper.make_tensor_value_info("Div_17_output_0", onnx.TensorProto.INT64, ["K"])
+    labels_out = onnx.helper.make_tensor_value_info("labels", onnx.TensorProto.INT64, ["K"])

     # ======================
     # ---- Graph ----
     # ======================
-    graph = helper.make_graph(
+    graph = onnx.helper.make_graph(
         nodes=[topk_node, less_node, div_node, mod_node],
         name="TopKGraph",
         inputs=[input_tensor],
@@ -65,7 +28,7 @@
         initializer=[K, zero, twenty_six],
     )

-    model = helper.make_model(graph, opset_imports=[helper.make_operatorsetid("", 13)])
+    model = onnx.helper.make_model(graph, opset_imports=[onnx.helper.make_operatorsetid("", 13)])

     # Validate + Save
     onnx.checker.check_model(model)
```
Description
For TRT EP's GetCapability(), in some cases GetSubGraph() won't add the graph's outputs to the ComputeCapability/IndexedSubGraph returned to ORT. The issue comes from the output-collection logic in GetSubGraph() (quoted in the review threads above); a simplified sketch of the problem follows.
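A simplified, self-contained sketch of the pattern being fixed, based on the diff context quoted in the review threads above (the `CandidateOutput` type and `CollectSubgraphOutputs` helper are illustrative stand-ins, not the EP's actual code):

```cpp
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Stand-ins for ORT's graph types (assumptions, not the real API): each
// candidate node output has a name, a consumer count within the graph, and a
// flag saying whether it is also one of the graph's outputs.
struct CandidateOutput {
  std::string name;
  int num_consumers = 0;
  bool is_graph_output = false;
};

// Sketch of the output-collection step in GetSubGraph().
std::vector<std::string> CollectSubgraphOutputs(const std::vector<CandidateOutput>& outputs) {
  std::unordered_map<std::string, int> fused_outputs;
  std::unordered_map<std::string, int> graph_outputs_to_add;
  int output_order = 0;
  for (const auto& output : outputs) {
    if (output.num_consumers > 0) {
      // Consumed by other nodes: tracked as a fused output.
      fused_outputs.insert({output.name, output_order++});
    }
    if (output.is_graph_output) {
      // This output is the graph's output, so it must also go into the
      // subgraph's output list. Before the fix, an output with no consumers
      // (e.g. TopK's "scores" in the new test model) never took either
      // branch and went missing from the returned IndexedSubGraph.
      graph_outputs_to_add.insert({output.name, output_order++});
    }
  }
  // Merge into the final list (the real code sorts by order index; here the
  // two maps are simply concatenated and de-duplicated).
  std::vector<std::string> result;
  std::unordered_set<std::string> seen;
  for (const auto* m : {&fused_outputs, &graph_outputs_to_add}) {
    for (const auto& kv : *m) {
      if (seen.insert(kv.first).second) {
        result.push_back(kv.first);
      }
    }
  }
  return result;
}
```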
Update TRT RTX EP as well.
Motivation and Context
#25373